Federated Self-training for Semi-supervised Audio Recognition

Authors

Abstract

Federated Learning is a distributed machine learning paradigm dealing with decentralized and personal datasets. Since data reside on devices such as smartphones and virtual assistants, labeling is entrusted to the clients, or labels are extracted in an automated way. Specifically, in the case of audio data, acquiring semantic annotations can be prohibitively expensive and time-consuming. As a result, an abundance of audio data remains unlabeled and unexploited on users' devices. Most existing federated learning approaches focus on supervised learning without harnessing the unlabeled data. In this work, we study the problem of semi-supervised learning of audio models via self-training in conjunction with federated learning. We propose FedSTAR to exploit large-scale on-device unlabeled data to improve the generalization of audio recognition models. We further demonstrate that self-supervised pre-trained models can accelerate the training of on-device models, significantly improving convergence within fewer training rounds. We conduct experiments on diverse public audio classification datasets and investigate the performance of our models under varying percentages of labeled and unlabeled data. Notably, we show that with as little as 3% labeled data available, FedSTAR on average can improve the recognition rate by 13.28% compared with the fully supervised federated model.
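The core idea of combining self-training with federated learning can be sketched as follows: each client fits a local model on its small labeled set, pseudo-labels its confident unlabeled examples, retrains on the union, and the server averages the resulting models (FedAvg). This is a minimal illustrative sketch with a toy linear softmax classifier; function names such as `self_train_client` and `fedavg`, the confidence threshold, and all hyperparameters are our assumptions, not the paper's actual implementation.

```python
# Minimal sketch of federated self-training: local pseudo-labeling plus
# FedAvg aggregation. Toy linear softmax model; all names/parameters are
# illustrative assumptions, not the FedSTAR codebase.
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_step(W, X, y, n_classes, lr=0.5):
    """One gradient step of multinomial logistic regression."""
    probs = softmax(X @ W)
    onehot = np.eye(n_classes)[y]
    grad = X.T @ (probs - onehot) / len(X)
    return W - lr * grad

def self_train_client(W, X_lab, y_lab, X_unlab, n_classes,
                      conf_thresh=0.8, steps=20):
    """Local update: fit on labeled data, pseudo-label confident
    unlabeled examples, then fit on the union."""
    for _ in range(steps):
        W = train_step(W, X_lab, y_lab, n_classes)
    probs = softmax(X_unlab @ W)
    keep = probs.max(axis=1) >= conf_thresh   # only confident pseudo-labels
    X_all = np.vstack([X_lab, X_unlab[keep]])
    y_all = np.concatenate([y_lab, probs[keep].argmax(axis=1)])
    for _ in range(steps):
        W = train_step(W, X_all, y_all, n_classes)
    return W

def fedavg(weights, sizes):
    """Server aggregation: size-weighted average of client models."""
    total = sum(sizes)
    return sum(w * (n / total) for w, n in zip(weights, sizes))

# Toy setup: 2 classes, 3 clients, mostly unlabeled data on each device.
n_feat, n_classes, n_clients = 5, 2, 3
W_global = np.zeros((n_feat, n_classes))
clients = []
for _ in range(n_clients):
    X = rng.normal(size=(60, n_feat))
    y = (X[:, 0] > 0).astype(int)            # simple separable task
    clients.append((X[:10], y[:10], X[10:])) # ~17% labeled per client

for _round in range(5):                      # federated rounds
    local = [self_train_client(W_global, Xl, yl, Xu, n_classes)
             for Xl, yl, Xu in clients]
    W_global = fedavg(local, [len(Xl) for Xl, _, _ in clients])
```

The confidence threshold is the key design choice: too low and noisy pseudo-labels corrupt training, too high and the unlabeled data goes unused. The paper's contribution is making this loop work at scale across federated rounds on real audio models.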


Similar articles

Semi-supervised PCA-Based Face Recognition Using Self-training

Performances of face recognition systems based on principal component analysis can degrade quickly when input images exhibit substantial variations, due for example to changes in illumination or pose, compared to the templates collected during the enrolment stage. On the other hand, a lot of new unlabelled face images, which could be potentially used to update the templates and re-train the sys...


Semi-Supervised Model Training for Unbounded Conversational Speech Recognition

For conversational large-vocabulary continuous speech recognition (LVCSR) tasks, up to about two thousand hours of audio is commonly used to train state of the art models. Collection of labeled conversational audio however, is prohibitively expensive, laborious and error-prone. Furthermore, academic corpora like Fisher English (2004) or Switchboard (1992) are inadequate to train models with suf...


Deep Co-Training for Semi-Supervised Image Recognition

In this paper, we study the problem of semi-supervised image recognition, which is to learn classifiers using both labeled and unlabeled images. We present Deep Co-Training, a deep learning based method inspired by the Co-Training framework [1]. The original Co-Training learns two classifiers on two views which are data from different sources that describe the same instances. To extend this con...


A Self-training Method for Semi-supervised GANs

Since the creation of Generative Adversarial Networks (GANs), much work has been done to improve their training stability, their generated image quality, their range of application but nearly none of them explored their self-training potential. Self-training has been used before the advent of deep learning in order to allow training on limited labelled training data and has shown impressive res...


Semi-Supervised Self-training Approaches for Imbalanced Splice Site Datasets

Machine Learning algorithms produce accurate classifiers when trained on large, balanced datasets. However, it is generally expensive to acquire labeled data, while unlabeled data is available in much larger amounts. A cost-effective alternative is to use Semi-Supervised Learning, which uses unlabeled data to improve supervised classifiers. Furthermore, for many practical problems, data often e...



Journal

Journal: ACM Transactions on Embedded Computing Systems

Year: 2022

ISSN: 1539-9087, 1558-3465

DOI: https://doi.org/10.1145/3520128